CUBE CONNECT Edition Help

MPICORES case studies

The first thing to know is that the MPICORES parameter provides parallel capability only on problems with multiple user classes, because it sets the number of user classes that are run simultaneously. We will again examine our three case studies here; performance results for using the MPICORES parameter on them are provided in Table 4.

Case 1: Medium-Size Static Problem (see Table 4). For our first test case, the MPICORES parameter does a much better job of reducing the solution time in the two-core case, but then gains very little from two to four cores. The initial benefit comes from executing the large sequential portions of code for each class in parallel, but the case also exposes a limitation of MPICORES as cores are added. Note first that although the MPICORES parameter may have been set to 4, the program reduced it to three cores, because MPICORES can only run user classes in parallel and this problem has only three. The primary factor in achieving only an extra 3-second reduction from the additional MPICORE is that the program can only run as fast as its slowest user class allows. Here a single user class takes the full 90 seconds to run on its own on a single core, so the program cannot perform any better than this without increasing the NCORES parameter.
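The two effects described in Case 1 can be illustrated with a minimal Python sketch. It is not part of CUBE: the function names are hypothetical, the model assumes that run time is dominated by per-class work, and the per-class times other than the 90-second slowest class are invented for illustration.

```python
def effective_mpicores(requested: int, n_user_classes: int) -> int:
    """Cores that can actually be used: never more than the number of user classes."""
    if requested < 1 or n_user_classes < 1:
        raise ValueError("requested cores and user classes must both be at least 1")
    return min(requested, n_user_classes)


def runtime_lower_bound(class_times, requested_mpicores: int) -> float:
    """Idealized lower bound on run time when each user class runs whole on one core.

    The run can be no shorter than the slowest class, and no shorter than
    the total work divided evenly over the usable cores.
    """
    cores = effective_mpicores(requested_mpicores, len(class_times))
    return max(max(class_times), sum(class_times) / cores)


# Hypothetical per-class times in seconds. Only the 90-second slowest
# class is taken from the text; the other two values are assumptions.
class_times = [90.0, 60.0, 50.0]

for requested in (1, 2, 3, 4):
    used = effective_mpicores(requested, len(class_times))
    bound = runtime_lower_bound(class_times, requested)
    print(f"MPICORES={requested}: uses {used} core(s), run time >= {bound:.0f} s")
```

Under these assumptions, requesting four cores behaves the same as requesting three, and the bound never drops below the 90 seconds of the slowest class.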

Case 2: Medium-Size Dynamic Problem (see Table 4). The second test case again shows good parallel results, and better results with two MPICORES than with two NCORES. The two user classes in this problem take nearly the same amount of time to run, so the load is well balanced and the speedup approaches the ideal 2X. However, since this problem has only two user classes, MPICORES cannot match the speed provided by using more NCORES.
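The balance argument in Case 2 can be written as a simple bound. The notation is an assumption introduced here, not taken from the product documentation: t_i is the run time of user class i on one core, C is the number of user classes, k is the MPICORES setting, and S_k is the speedup over one core. The bound is idealized in that it counts only per-class work and ignores communication overhead.

```latex
% Assumed notation: t_i = run time of user class i, C = number of user classes,
% k = MPICORES setting, S_k = speedup over the single-core run.
S_k \;\le\; \min\!\left( \min(k, C),\; \frac{\sum_{i=1}^{C} t_i}{\max_{1 \le i \le C} t_i} \right)
% Case 2: C = 2 and t_1 \approx t_2, so the right-hand side is approximately 2.
```

When one class dominates, the second term approaches 1 and adding MPICORES brings little benefit.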

Case 3: Large-Size Static Problem (see Table 4). Our large static test case does not perform nearly as well when increasing the MPICORES value as it did when increasing the NCORES parameter. The large amount of time this problem spends in the optimization loop suggests good overall performance with the NCORES parameter, but we would still expect a good run-time reduction from increasing MPICORES, considering the large number of user classes to be estimated. This is not necessarily so. The reason goes back to the lesson of Case 1, where the problem could only run as fast as its slowest user class; in this case a single user class runs in the 85,000-second range.
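The plateau seen in Case 3 can be reproduced with a small simulation. This is a sketch under stated assumptions, not a description of how CUBE assigns user classes to cores: the greedy longest-first assignment is a stand-in policy, the function name is hypothetical, and every class time except the 85,000-second slowest class is invented.

```python
import heapq


def lpt_makespan(class_times, cores: int) -> float:
    """Assign whole user classes to cores, longest first, each to the least-loaded core.

    Returns the finish time of the busiest core. This greedy policy is an
    assumption made for illustration only.
    """
    cores = min(cores, len(class_times))  # extra cores beyond the class count sit idle
    loads = [0.0] * cores
    heapq.heapify(loads)
    for t in sorted(class_times, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + t)
    return max(loads)


# Hypothetical workload with 9 user classes: one dominant class near
# 85,000 s (from the text) and eight invented lighter classes of 5,000 s.
class_times = [85000.0] + [5000.0] * 8
serial_time = sum(class_times)

for cores in (1, 2, 4, 9):
    finish = lpt_makespan(class_times, cores)
    print(f"{cores} core(s): {finish:.0f} s, speedup {serial_time / finish:.2f}x")
```

With these invented times, the finish time stays at 85,000 seconds from two cores upward, because the dominant class cannot be split across cores by MPICORES alone.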

Table 4: Performance comparison for example problems with varying MPICORES parameter. Case 1: Medium-size static estimation problem with 3 user classes, 4681 zones, and 954 screenline counts. Case 2: Medium-size dynamic estimation problem with 2 user classes, 57 zones, 12 time intervals, 44 screenline counts (class 1), and 42 screenline counts (class 2). Case 3: Very large static estimation problem with 9 user classes, 10,000+ screenline counts, and 6000+ zones. The top number of each entry indicates the total run time in seconds, and the bottom number represents the speedup multiple over the base single-core time.

Two main points can be made from an examination of these test cases:

1. The use of MPICORES allows user classes to be run in parallel, which can lead to the best speedups since sequential code sections are performed in parallel.

2. A problem will only run as fast as its slowest running user class.

Having examined the NCORES and MPICORES parameters individually, we now examine how they can be used together to provide the best possible performance gains. See the next section, Using MPICORES and NCORES together.